TikTok Shop Showed Me Search Suggestions for Products With Nazi Symbolism
Even after TikTok removed swastika jewelry from its online shop, I was algorithmically nudged toward a web of Nazi-related products during searches, like "double lightning bolt" and "ss" necklaces. My journey on TikTok Shop started out with a search for "hip hop jewelry." It's an innocuous search query multiple users have likely typed in, hoping to find something to wear. While browsing the cheap jewelry, I was struck by what TikTok's algorithm repeatedly suggested that I might also be interested in: jewelry with blatant Nazi symbolism. TikTok continues to struggle with moderation as its in-app ecommerce store gains traction with younger users.
- South America > Venezuela > Capital District > Caracas (0.05)
- North America > United States > Minnesota (0.05)
- North America > United States > Indiana (0.05)
- (5 more...)
- Information Technology > Services (1.00)
- Retail (0.98)
Cats love to massacre bugs, and scientists have the videos to prove it
Nearly one in three U.S. households harbor a cold-hearted killer. Some even have a well-known proclivity for torture. And while the popular pets are best known for downing birds and cornering mice, they are also adept at hunting all manner of bugs. Host a cat in your home long enough and you'll likely become accustomed to regular deliveries of amputated insect legs, wings, or the occasional whole carcass.
- South America > Brazil (0.05)
- Oceania > New Zealand (0.05)
- North America > United States > New York (0.05)
- (3 more...)
- Retail (0.48)
- Information Technology (0.48)
LAVIB: A Large-scale Video Interpolation Benchmark Appendix
Terms are grouped into five main types: location, activities, weather, misc, and camera types. Locations and activities include two levels of hierarchy. The structure of search terms changes based on the selected sub-group. Using an exhaustive list of locations is not feasible given the search space. 'scenic' as actions are less relevant when the locations are broad.
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
- Asia > South Korea > Seoul > Seoul (0.05)
- Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
- (38 more...)
On Developers' Self-Declaration of AI-Generated Code: An Analysis of Practices
Kashif, Syed Mohammad, Liang, Peng, Tahir, Amjed
AI code generation tools have gained significant popularity among developers, who use them to assist in software development due to their capability to generate code. Existing studies mainly explored the quality, e.g., correctness and security, of AI-generated code, while in real-world software development, the prerequisite is to distinguish AI-generated code from human-written code, which emphasizes the need to explicitly declare AI-generated code by developers. To this end, this study intends to understand the ways developers use to self-declare AI-generated code and explore the reasons why developers choose to self-declare or not. We conducted a mixed-methods study consisting of two phases. In the first phase, we mined GitHub repositories and collected 613 instances of AI-generated code snippets. In the second phase, we conducted a follow-up practitioners' survey, which received 111 valid responses. Our research revealed the practices followed by developers to self-declare AI-generated code. Most practitioners (76.6%) always or sometimes self-declare AI-generated code. In contrast, other practitioners (23.4%) noted that they never self-declare AI-generated code. The reasons for self-declaring AI-generated code include the need to track and monitor the code for future review and debugging, and ethical considerations. The reasons for not self-declaring AI-generated code include extensive modifications to AI-generated code and the developers' perception that self-declaration is an unnecessary activity. We finally provided guidelines for practitioners to self-declare AI-generated code, addressing ethical and code quality concerns.
- Asia > China > Hubei Province > Wuhan (0.04)
- South America > Brazil (0.04)
- Oceania > New Zealand > North Island > Manawatū-Whanganui > Palmerston North (0.04)
- (7 more...)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Overview (1.00)
On Using Large Language Models to Enhance Clinically-Driven Missing Data Recovery Algorithms in Electronic Health Records
Lotspeich, Sarah C., Collins, Abbey, Wells, Brian J., Khanna, Ashish K., Rigdon, Joseph, McGowan, Lucy D'Agostino
Objective: Electronic health records (EHR) data are prone to missingness and errors. Previously, we devised an "enriched" chart review protocol where a "roadmap" of auxiliary diagnoses (anchors) was used to recover missing values in EHR data (e.g., a diagnosis of impaired glycemic control might imply that a missing hemoglobin A1c value would be considered unhealthy). Still, chart reviews are expensive and time-intensive, which limits the number of patients whose data can be reviewed. Now, we investigate the accuracy and scalability of a roadmap-driven algorithm, based on ICD-10 codes (International Classification of Diseases, 10th revision), to mimic expert chart reviews and recover missing values. Materials and Methods: In addition to the clinicians' original roadmap from our previous work, we consider new versions that were iteratively refined using large language models (LLM) in conjunction with clinical expertise to expand the list of auxiliary diagnoses. Using chart reviews for 100 patients from the EHR at an extensive learning health system, we examine algorithm performance with different roadmaps. Using the larger study of 1,000 patients, we applied the final algorithm, which used a roadmap with clinician-approved additions from the LLM. Results: The algorithm recovered as much, if not more, missing data as the expert chart reviewers, depending on the roadmap. Discussion: Clinically-driven algorithms (enhanced by LLM) can recover missing EHR data with similar accuracy to chart reviews and can feasibly be applied to large samples. Extending them to monitor other dimensions of data quality (e.g., plausibility) is a promising future direction.
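The roadmap idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the `ROADMAP` contents, the function name `recover_status`, and the returned labels are all hypothetical. The ICD-10 prefixes E11 (type 2 diabetes mellitus) and R73 (elevated blood glucose level) are real codes, chosen here only as plausible anchors for impaired glycemic control.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical roadmap: each lab component maps to ICD-10 code prefixes
# (auxiliary diagnoses, or "anchors") that would imply an unhealthy value.
# These anchors are illustrative and are not taken from the paper.
ROADMAP: Dict[str, Tuple[str, ...]] = {
    "hemoglobin_a1c": ("E11", "R73"),
}


def recover_status(
    component: str,
    value: Optional[float],
    patient_codes: Iterable[str],
    roadmap: Dict[str, Tuple[str, ...]] = ROADMAP,
) -> str:
    """Classify a lab component as observed, recovered via an anchor, or still missing."""
    if value is not None:
        # Nothing to recover: the value was recorded in the EHR.
        return "observed"
    anchors = roadmap.get(component, ())
    # str.startswith accepts a tuple of prefixes; an empty tuple never matches.
    if any(code.startswith(anchors) for code in patient_codes):
        return "unhealthy (recovered)"
    return "missing"


if __name__ == "__main__":
    print(recover_status("hemoglobin_a1c", None, ["E11.9", "I10"]))
```

A prefix match is used so that a subcode such as E11.9 counts as evidence for its parent category; whether the published roadmap matches on prefixes or on exact codes is not stated in the abstract.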
- North America > United States > North Carolina > Forsyth County > Winston-Salem (0.14)
- South America (0.14)
- North America > United States > Texas > Harris County > Houston (0.04)
- (9 more...)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.93)
Revisiting Formal Methods for Autonomous Robots: A Structured Survey
Azaiez, Atef, Anisi, David A., Farrell, Marie, Luckcuck, Matt
This paper presents the initial results from our structured literature review on applications of Formal Methods (FM) to Robotic Autonomous Systems (RAS). We describe our structured survey methodology, including database selection and associated search strings, search filters and collaborative review of identified papers. We categorise and enumerate the FM approaches and formalisms that have been used for specification and verification of RAS. We investigate FM in the context of sub-symbolic AI-enabled RAS and examine the evolution of how FM is used over time in this field. This work complements a pre-existing survey in this area and we examine how this research area has matured over time. Specifically, our survey demonstrates that some trends observed in a previous survey have persisted. Additionally, it identifies new trends that were not considered previously, including a noticeable increase in adopting Formal Synthesis approaches as well as Probabilistic Verification Techniques.
- Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.14)
- Europe > Ireland (0.04)
- Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
- (3 more...)
- Overview (1.00)
- Research Report (0.84)
- Information Technology (0.68)
- Transportation > Infrastructure & Services (0.46)
Decomposed Reasoning with Reinforcement Learning for Relevance Assessment in UGC Platforms
Yuan, Xiaowei, Jin, Lei, Zhang, Haoxin, Gao, Yan, Wu, Yi, Hu, Yao, Huang, Ziyang, Zhao, Jun, Liu, Kang
Retrieval-augmented generation (RAG) plays a critical role in user-generated content (UGC) platforms, but its effectiveness depends heavily on accurate relevance assessment of query-document pairs. Despite recent advances in applying large language models (LLMs) to relevance modeling, UGC platforms present unique challenges: 1) ambiguous user intent due to sparse user feedback in RAG scenarios, and 2) substantial noise introduced by informal and unstructured language. To address these issues, we propose the Reinforced Reasoning Model for Relevance Assessment (R3A), which introduces a decomposed reasoning framework over queries and candidate documents before scoring. R3A first leverages auxiliary high-ranked documents within the platform to infer latent query intent. It then performs verbatim fragment extraction to justify relevance decisions, thereby reducing errors caused by noisy UGC. Based on a reinforcement learning framework, R3A is optimized to mitigate distortions arising from ambiguous queries and unstructured content. Experimental results show that R3A significantly outperforms existing baseline methods in terms of relevance accuracy, across both offline benchmarks and online experiments.
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Europe > Italy (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
TUM-MiKaNi at SemEval-2025 Task 3: Towards Multilingual and Knowledge-Aware Non-factual Hallucination Identification
Anschütz, Miriam, Gikalo, Ekaterina, Herbster, Niklas, Groh, Georg
Hallucinations are one of the major problems of LLMs, hindering their trustworthiness and deployment to wider use cases. However, most of the research on hallucinations focuses on English data, neglecting the multilingual nature of LLMs. This paper describes our submission to the SemEval-2025 Task-3 - Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes. We propose a two-part pipeline that combines retrieval-based fact verification against Wikipedia with a BERT-based system fine-tuned to identify common hallucination patterns. Our system achieves competitive results across all languages, reaching top-10 results in eight languages, including English. Moreover, it supports multiple languages beyond the fourteen covered by the shared task. This multilingual hallucination identifier can help to improve LLM outputs and their usefulness in the future.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- (4 more...)
- Workflow (0.46)
- Research Report (0.40)
Can Artificial Intelligence Generate Quality Research Topics Reflecting Patient Concerns?
Kim, Jiyeong, Chen, Michael L., Rezaei, Shawheen J., Ramirez-Posada, Mariana, Caswell-Jin, Jennifer L., Kurian, Allison W., Riaz, Fauzia, Sarin, Kavita Y., Tang, Jean Y., Asch, Steven M., Linos, Eleni
Patient-centered research is increasingly important in narrowing the gap between research and patient care, yet incorporating patient perspectives into health research has been inconsistent. We propose an automated framework leveraging innovative natural language processing (NLP) and artificial intelligence (AI) with patient portal messages to generate research ideas that prioritize important patient issues. We further quantified the quality of AI-generated research topics. To define patient clinical concerns, we analyzed 614,464 patient messages from 25,549 individuals with breast or skin cancer obtained from a large academic hospital (2013 to 2024), constructing a 2-staged unsupervised NLP topic model. Then, we generated research topics to resolve the defined issues using a widely used AI (ChatGPT-4o, OpenAI Inc, April 2024 version) with prompt-engineering strategies. We guided AI to perform multi-level tasks: 1) knowledge interpretation and summarization (e.g., interpreting and summarizing the NLP-defined topics), 2) knowledge generation (e.g., generating research ideas corresponding to patients' issues), 3) self-reflection and correction (e.g., ensuring and revising the research ideas after searching for scientific articles), and 4) self-reassurance (e.g., confirming and finalizing the research ideas). Six highly experienced breast oncologists and dermatologists assessed the significance and novelty of AI-generated research topics using a 5-point Likert scale (1-exceptional, 5-poor). One-third of the AI-suggested research topics were highly significant and novel when both scores were lower than the average. Two-thirds of the AI-suggested topics were novel in both cancers. Our findings demonstrate that AI-generated research topics reflecting patient perspectives via a large volume of patient messages can meaningfully guide future directions in patient-centered health research.
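The selection rule mentioned in this abstract (a topic counts as highly significant and novel when both scores are lower than the average, with 1 meaning exceptional) can be sketched as follows. This is one possible reading, assuming "the average" means the mean across all rated topics; the function name `flag_top_topics` and the example ratings are hypothetical and not taken from the study.

```python
from statistics import mean
from typing import Dict, List, Tuple


def flag_top_topics(ratings: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return topics whose significance AND novelty scores are both below the mean.

    ratings maps a topic name to (significance, novelty) on a 5-point Likert
    scale where 1 is exceptional and 5 is poor, so lower is better.
    """
    avg_significance = mean(sig for sig, _ in ratings.values())
    avg_novelty = mean(nov for _, nov in ratings.values())
    return [
        topic
        for topic, (sig, nov) in ratings.items()
        if sig < avg_significance and nov < avg_novelty
    ]


if __name__ == "__main__":
    # Hypothetical ratings, for illustration only.
    example = {
        "topic_a": (2.0, 2.0),
        "topic_b": (3.5, 3.0),
        "topic_c": (3.0, 2.0),
    }
    print(flag_top_topics(example))
```

If the study instead compared scores against a fixed threshold such as the scale midpoint, the two `mean` calls would be replaced by that constant.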
- North America > United States > California > Santa Clara County > Stanford (0.04)
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Dermatology (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (0.54)
- Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.36)